Web Data Extraction for Business Intelligence: The Lixto Approach
Abstract
Knowledge about market developments and competitor activities is increasingly a critical success factor for enterprises. The World Wide Web provides public-domain information that can be retrieved, for example, from Web sites or online shops. Extraction from these semi-structured information sources is mostly done manually and is therefore very time-consuming. This paper describes how public information can be extracted automatically from Web sites, transformed into structured data formats, and used for data analysis in Business Intelligence systems.
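The extract-and-transform step described in the abstract can be illustrated with a minimal sketch. It is not the Lixto method, which the abstract does not detail; it only shows the general idea of turning semi-structured HTML into structured records using Python's standard `html.parser`. The sample page, the CSS class names (`product`, `name`, `price`), and the helper names `ProductParser`, `normalise`, and `extract_products` are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical shop page: markup and class names are assumptions for illustration.
SAMPLE_PAGE = """
<ul>
  <li class="product"><span class="name">Widget A</span> <span class="price">19.90 EUR</span></li>
  <li class="product"><span class="name">Widget B</span> <span class="price">24.50 EUR</span></li>
</ul>
"""


def normalise(raw):
    """Turn raw text fields into a typed record (assumes prices look like '19.90 EUR')."""
    amount, currency = raw["price"].split()
    return {"name": raw["name"].strip(), "price": float(amount), "currency": currency}


class ProductParser(HTMLParser):
    """Collect name/price fields from <li class="product"> items."""

    def __init__(self):
        super().__init__()
        self.records = []      # structured output rows
        self._current = None   # raw fields of the item being read, if any
        self._field = None     # field that incoming text belongs to, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "product" in classes:
            self._current = {}
        elif self._current is not None and tag == "span":
            if "name" in classes:
                self._field = "name"
            elif "price" in classes:
                self._field = "price"

    def handle_data(self, data):
        # Only keep text that sits inside a recognised field of a product item.
        if self._current is not None and self._field:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            # Skip items that lack either field instead of guessing values.
            if "name" in self._current and "price" in self._current:
                self.records.append(normalise(self._current))
            self._current = None


def extract_products(html):
    """Parse an HTML string and return a list of structured product records."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.records
```

Calling `extract_products(SAMPLE_PAGE)` yields one dictionary per product, which could then be loaded into a table for analysis. A production wrapper would also need navigation, error handling, and robustness to layout changes, none of which this sketch attempts.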
Similar resources
The Lixto Systems Applications in Business Intelligence and Semantic Web
This paper shows how technologies for Web data extraction, syndication and integration allow for new applications and services in the Business Intelligence and the Semantic Web domain. First, we demonstrate how knowledge about market developments and competitor activities on the market can be extracted dynamically and automatically from semi-structured information sources on the Web. Then, we s...
Lixto – Price Intelligence Suite
IMPACT The Lixto Price Intelligence Suite is a solution that extracts price-comparison information from competitor online web channels and combines it with internal data sources so organizations gain greater visibility into the factors that might influence price. The suite works by navigating and extracting competitive product and pricing information from predefined online data sources before s...
Scalable Web Data Extraction for Online Market Intelligence
Online market intelligence (OMI), in particular competitive intelligence for product pricing, is a very important application area for Web data extraction. However, OMI presents non-trivial challenges to data extraction technology. Sophisticated and highly parameterized navigation and extraction tasks are required. On-the-fly data cleansing is necessary in order to identify identical products ...
Web Information Acquisition with Lixto Suite: A Demonstration
We demonstrate the Lixto Suite, a web data extraction and transformation software kit for retrieving and converting information from various sources to various customer devices. With the Lixto Suite, non-technical content managers can rapidly develop applications in the areas of M-Commerce, E-Commerce, content integration and corporate portals.
Intelligent Wrapping from PDF Documents
Wrapping is the process of navigating a data source, semi-automatically extracting data, and transforming it into a form suitable for data processing applications. The semi-structured form of web pages, coupled with the availability of business-relevant data, has led to the availability of several established products on the market for wrapping data from the Web. One such approach is the Lixto me...